-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pad table headers up to max row length to avoid syntax errors, closes… #36
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull needs a bit of work, but I think it's a good idea to pad the headers!
btw, this would close jgm/pandoc#4059
Text/Pandoc/Builder.hs
Outdated
headers' = if null headers | ||
then replicate numcols mempty | ||
else headers | ||
headers' = pad mempty numcols headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the pad
function is only used here, I would move it here: on the same level as headers'
and numcols
.
or even better, do without it:
headers' = take numcols $ headers ++ repeat mempty
pandoc-types.cabal
Outdated
@@ -1,5 +1,5 @@ | |||
Name: pandoc-types | |||
Version: 1.17.3 | |||
Version: 1.17.4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jgm will increment the version number in a separate commit when he releases a new version... no need to do it in a pull request. (same for the changelogs AFAIK)
Text/Pandoc/Builder.hs
Outdated
@@ -305,6 +305,16 @@ setDate = setMeta "date" | |||
|
|||
-- Inline list builders | |||
|
|||
-- | Extend a list with a default element up to a maximum length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we aren't using doctest
in this project... so I'd just add a few more tests to test-pandoc-types.hs
okay i added a commit with the changes you mentioned, ready to be squashed |
If I understand correctly, this pads headers to max row length, but doesn't pad rows. Should rows be padded too if they have fewer cells than other rows or the header? |
yeah padding rows is a low-hanging fruit here, i'll add a test and the feature |
i added row padding. some tests are failing but the error seems independent from the changes i made |
If I'm not mistaken, nothing prevents having a different number of columns in the body than in the One option would be to take the cellspec as determining the number of columns, and pad or truncate all rows to that number. |
i'd say to pad also the specification. about the idea to use the specification as an indicator of the max size, i'd rather keep the current max length. i think that it's more sensible because:
|
Padding the specification is a bit tricky, though. What if
they specify two nonzero widths, but have a row with three
columns? Then we'll get a column with 0, and that will
screw up renderers, which expect either all 0s or all
nonzero values.
+++ Francesco Occhipinti [Dec 03 17 17:41 ]:
… i'd say to pad also the specification. about the idea to use the
specification as an indicator of the max size, i'd rather keep the
current max length. i think that it's more sensible because:
* the specification is metadata and the user does not see it on some
source formats
* picking the max size, no data is lost
—
You are receiving this because you commented.
Reply to this email directly, [1]view it on GitHub, or [2]mute the
thread.
References
1. #36 (comment)
2. https://github.com/notifications/unsubscribe-auth/AAAL5JXeabSZ_bQj0uCvs8hRyAKnKBazks5s8t1YgaJpZM4QpGkX
|
if this is getting tricky, i'm not sure that we should keep discussing in this context. the original motivation of this pull request is to address issue jgm/pandoc#4059 which caused syntax errors to an user. from the point of view of that user, this pull request is already solving a problem. we also thought to extend this to rows, but now it seems like our desire to improve more is preventing us to improve the experience of that user |
I don't think my proposal was tricky. The proposal is
simple: the 'table' builder takes the cellspecs to
determine the number of columns, and pads or truncates
the rows to ensure that number of columns. Presumably
it's a programming error anyway if someone includes
a row with more cells than the cellspecs.
With this change, we'd have a guarantee that any table
created with the table builder is consistent.
|
oh, i see, you were making a point about your proposal while i thought that we were stuck. let me reconsider more carefully |
442d858
to
ed19be7
Compare
ed19be7
to
e354416
Compare
i updated the pull request, now it pads or truncate depending on |
any problems with this? once merged this will enable pandoc's translation to become more robust to malformed input |
i tried to run pandoc tests using this version of |
alright it was not that bad, three tests were actually failing and here is the fix.
we have a problem similar to 3337 with the DOCX parser, i think that i will change the logic there so that the maximum width is used for the specs |
i saw that there were some updates on |
Thanks! |
Thanks to you! Let me know whether i can help updating the tests in pandoc. I remind you that some info about how to update them and a commit with the changes are in the comment above |
OK, I started looking at pandoc tests. In command test 3337, we have a table with no header and two body rows, one with two cells, one with one. The test output expects the table to be padded to two cells, but with this commit it's padded to one cell. I'm actually starting to change my mind about the decision above; maybe padding to the maximum width makes the most sense? After all, that is what HTML will do. |
OK, looking back at my reasons above, they seem sound. |
I'm going to make the needed changes in pandoc tests and pandoc. |
In jgm/pandoc-types#36 we changed the table builder to pad cells. This commit changes tests (and two readers) to accord with this behavior.
sorry for being late on this. In the case of 3337, as you said, the HTML reader is responsible for calling the builder with specs that will not lead to any data loss. Thanks for moving this forward, i'll verify that it closes #4059 |
… #4059